The Qualtrics team earmarked all of the problematic cases for us (duplicates, speeders, possible bots, etc.), and these have all been scrubbed. I’ve also done some cleaning of the raw data myself; there were a few small things that needed fixing (e.g. respondents who entered their age as “1987” rather than “33”). In the end the total sample size was 906. No major concerns there.
To begin, I performed a few quick sanity checks on data quality, and double-checked that we didn’t have any troubling chance differences in major demographics between our four randomised groups.
All completed surveys returned with a completion time of less than half the median (i.e. less than 262 seconds) were culled by the Qualtrics team. This is a standard procedure Qualtrics runs to ensure data quality on all their online surveys. However, rather than using the completion time for the full survey as is usually done, I asked the Qualtrics team to compute this from the first substantive question to the last substantive question, leaving out the time spent reading the article at the beginning of the survey. This was done so that we could safely use the same exclusion threshold for each of the intervention groups as well as the control group, who did not read an article (and thus would have had a faster completion time).
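As a sketch, the exclusion rule works out like this (using a hypothetical toy vector `completion_secs` of substantive-question timings; the real threshold of 262 seconds came from the full sample):

```r
# Half-median exclusion rule (sketch; `completion_secs` is a hypothetical
# toy vector, not the real data)
completion_secs <- c(250, 300, 520, 576, 610, 900, 1400)

threshold <- median(completion_secs) / 2  # half the median
keep <- completion_secs >= threshold      # respondents retained

threshold    # 288 for this toy vector
sum(!keep)   # 1 respondent culled
```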
Descriptive statistics for completion time are displayed in the table and plot below. For convenience I’ve included both seconds and minutes. The boxplot displays the median and IQR, alongside a histogram of counts and raw data (small crosses).
| Time | Min | Max | Median | Mean | SD |
|---|---|---|---|---|---|
| Seconds | 262.00 | 45462.00 | 576.00 | 833.48 | 1848.79 |
| Minutes | 4.37 | 757.70 | 9.60 | 13.89 | 30.81 |
Following these data-cleaning measures we were left with a sample of 906 participants. Group sizes (n) were similar after randomisation, and did not appear to have been distorted by the cleaning. I performed a Bayesian multinomial regression to estimate the probability of being assigned to each group in the final data set.
| Group | n | Observed % | p | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|---|---|
| Control | 234 | 25.828 | 0.258 | 0.230 | 0.285 | 0.95 | median | hdi |
| Brain | 228 | 25.166 | 0.252 | 0.224 | 0.279 | 0.95 | median | hdi |
| Design | 224 | 24.724 | 0.247 | 0.220 | 0.275 | 0.95 | median | hdi |
| Clubs | 220 | 24.283 | 0.243 | 0.216 | 0.271 | 0.95 | median | hdi |
No obvious cause for alarm there. All posterior intervals overlapped p = 0.25, the value we would expect if participants were distributed evenly across groups. There were a few more participants in the control group, perhaps because this condition was less onerous (no reading), but any such differences were small, and not reliable enough to worry about.
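As a cross-check, the same group-assignment probabilities can be approximated without MCMC via the conjugate Dirichlet-multinomial posterior (a simplification of the model above, assuming a uniform Dirichlet(1, 1, 1, 1) prior):

```r
set.seed(1)
n <- c(Control = 234, Brain = 228, Design = 224, Clubs = 220)

# Dirichlet(n + 1) draws via normalised independent Gamma draws
draws <- sapply(n + 1, function(a) rgamma(1e4, shape = a))
draws <- draws / rowSums(draws)

# Posterior median and 95% interval for each group's assignment probability
round(apply(draws, 2, quantile, probs = c(0.025, 0.5, 0.975)), 3)
```

These intervals land very close to the HDIs in the table above, and all overlap 0.25.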
Here are the posterior intervals on a plot:
The sample was “soft” stratified to approximately balance the numbers of participants identifying as Male (436) and Female (468) against population data. Of the two remaining participants, one self-described as “non-binary”, and the other did not submit a text response.
Participants were aged between 18 and 89. To ensure a diverse range of age groups was recruited, participants were also stratified into the age groups listed in the table below during sampling.
| Age Quotas | n |
|---|---|
| 18-24 | 77 |
| 25-34 | 175 |
| 35-44 | 165 |
| 45-54 | 156 |
| 55-64 | 145 |
| 65+ | 188 |
The women in our sample were younger on average (mean age = 44.83) than the men (mean age = 50.39).
A simple logistic regression suggested that Age predicted Gender, such that older participants in this sample were more likely to have identified as Male. If we wanted to investigate either age or gender as covariates, it might be best to consider both.
# This time modelling using the rethinking package:
m.GenderAge <-
ulam(
alist(
Male ~ dbern(p),
logit(p) <- alpha + b*Age,
# I'm using some fairly mild default priors
# See McElreath's Statistical Rethinking Chapter 11 for an intro
alpha ~ dnorm(0, 1.5),
b ~ dnorm(0, 1)
), data = dat_list, cores = 8, chains = 8, iter = 1750, warmup = 500
)
## Errors about initial values aren't something to worry about in this instance
gather_draws(m.GenderAge, alpha, b) |>
median_hdi() |>
kable(digits = 3, caption = "Logistic Regression: Gender by Age (LogOdds)")
| .variable | .value | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| alpha | -0.994 | -1.368 | -0.588 | 0.95 | median | hdi |
| b | 0.019 | 0.012 | 0.027 | 0.95 | median | hdi |
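To make those log-odds concrete, we can convert the posterior medians from the table into probabilities and an odds ratio (a sketch using only the point estimates above, ignoring posterior uncertainty):

```r
alpha <- -0.994  # log-odds of identifying as Male at Age = 0 (posterior median)
b     <-  0.019  # change in log-odds of Male per year of age

# Implied probability of identifying as Male at ages 30 and 70
round(plogis(alpha + b * 30), 2)  # ~0.40
round(plogis(alpha + b * 70), 2)  # ~0.58

# Odds ratio for a 10-year increase in age
round(exp(b * 10), 2)  # ~1.21
```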
Self-reported gender was approximately evenly distributed across groups:
m.GenderGroup <-
ulam(
alist(
Male ~ dbern(p),
logit(p) <- alpha[Group],
alpha[Group] ~ dnorm(0, 1.5)
), data = dat_list, cores = 8, chains = 8, iter = 1750, warmup = 500
)
d |>
count(Group, Gender) |>
group_by(Group) |>
mutate(observed.prop = n/sum(n)) |>
filter(Gender == "Male") -> obs
post <- tidy_draws(m.GenderGroup)
# Long form
post <-
post |>
select(.draw, alpha.1:alpha.4) |>
pivot_longer(cols = alpha.1:alpha.4, names_to = "k", names_prefix = "alpha.", values_to = "alpha")
# Produce summary table
post |>
# Rename groups
mutate(Group = factor(k, levels = 1:4, labels = levels(d$Group))) |>
# Compute probabilities (logistic transform)
mutate(p = plogis(alpha)) |>
group_by(Group) |>
median_hdi(p) -> post.sum
left_join(obs, post.sum) |>
select(Group, n, observed.prop, p, .lower, .upper) |>
kable(digits = 3, caption = "Observed Proportions and Probability Estimates for Identifying as Male, by Group")
| Group | n | observed.prop | p | .lower | .upper |
|---|---|---|---|---|---|
| Control | 108 | 0.462 | 0.466 | 0.404 | 0.528 |
| Brain | 109 | 0.478 | 0.478 | 0.415 | 0.544 |
| Design | 105 | 0.469 | 0.469 | 0.402 | 0.533 |
| Clubs | 114 | 0.518 | 0.523 | 0.460 | 0.590 |
post |>
mutate(Group = factor(k, levels = 1:4, labels = levels(d$Group))) |>
mutate(p = plogis(alpha)) |>
ungroup() |>
select(.draw, Group, p) |>
pivot_wider(names_from = Group, values_from = p) |>
# Note: these are ratios of probabilities (p vs p in Control), not true
# odds ratios; compute odds p/(1 - p) first if odds ratios are needed
mutate(across(Control:Clubs, ~.x / Control)) |>
pivot_longer(Control:Clubs, names_to = "Group", values_to = "Ratio") |>
mutate(Group = factor(Group, levels(d$Group))) |>
filter(Group != "Control") |>
group_by(Group) |>
median_hdi() |>
kable(digits = 2, caption = "Probability Ratio vs Control for Male ID by Group")
| Group | Ratio | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| Brain | 1.03 | 0.84 | 1.24 | 0.95 | median | hdi |
| Design | 1.01 | 0.83 | 1.21 | 0.95 | median | hdi |
| Clubs | 1.12 | 0.91 | 1.33 | 0.95 | median | hdi |
Mean age did not differ substantially between experimental groups:
| name | value | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| alpha | 48.19 | 46.20 | 50.06 | 0.95 | median | hdi |
| bBr | -0.73 | -3.46 | 1.98 | 0.95 | median | hdi |
| bCl | -0.62 | -3.31 | 2.22 | 0.95 | median | hdi |
| bD | -1.26 | -3.99 | 1.48 | 0.95 | median | hdi |
Some slight differences in State:
# Predict NSW by Group ---------------------------------------------------------------------------
dat_list <- list(Group = as.integer(d$Group), NSW = d$State2 == "New South Wales")
m.State.Group <- ulam(
alist(
NSW ~ bernoulli(p),
logit(p) <- alpha[Group],
alpha[Group] ~ normal(0, 1.5)
), data = dat_list, cores = 8, chains = 8, iter = 1750, warmup = 500
)
d |>
count(Group, State2) |>
group_by(Group) |>
mutate(observed.prop = n/sum(n)) |>
filter(State2 == "New South Wales") -> obs
post <- tidy_draws(m.State.Group)
# Long form
post <-
post |>
select(.draw, alpha.1:alpha.4) |>
pivot_longer(cols = alpha.1:alpha.4, names_to = "Group", names_prefix = "alpha.", values_to = "alpha")
post |>
mutate(Group = factor(Group, levels = 1:4, labels = levels(d$Group)),
p = plogis(alpha)) |>
group_by(Group) |>
median_hdi(p) -> post.sum
left_join(obs, post.sum) |>
select(Group, n, observed.prop, p, .lower, .upper) |>
kable(digits = 3, caption = "Observed Proportions and Probability Estimates for NSW Location, by Group")
| Group | n | observed.prop | p | .lower | .upper |
|---|---|---|---|---|---|
| Control | 120 | 0.513 | 0.513 | 0.451 | 0.579 |
| Brain | 106 | 0.465 | 0.465 | 0.400 | 0.531 |
| Design | 92 | 0.411 | 0.411 | 0.348 | 0.475 |
| Clubs | 115 | 0.523 | 0.523 | 0.456 | 0.585 |
Participants were recruited from both NSW (433) and Victoria (473). We collected some simple demographic information about our sample, as displayed below:
| Local Area | n | Percent |
|---|---|---|
| Major City | 661 | 73.0 |
| Inner Regional | 144 | 15.9 |
| Outer Regional | 88 | 9.7 |
| Remote | 12 | 1.3 |
| Very Remote | 1 | 0.1 |
| Education | n | Percent |
|---|---|---|
| Year 10 or below | 83 | 9.2 |
| Year 11 or equivalent | 33 | 3.6 |
| Year 12 or equivalent | 133 | 14.7 |
| A trade, technical certificate or diploma | 227 | 25.1 |
| Undergraduate university degree | 268 | 29.6 |
| Postgraduate degree | 162 | 17.9 |
| Employment | n | Percent |
|---|---|---|
| Yes, full time | 339 | 37.4 |
| Yes, part time | 149 | 16.4 |
| Yes, casual | 44 | 4.9 |
| No, not employed - looking for work | 96 | 10.6 |
| No, not employed - not looking for work | 267 | 29.5 |
| Currently stood-down | 11 | 1.2 |
Our sample included a number of individuals who gambled regularly, and a number who reported symptoms indicative of varying levels of risk of gambling-related harm. Again, the distribution of PGSI risk categories did not differ significantly between experimental groups.
| Group | No Harm | Low Risk | Moderate Risk | High Risk |
|---|---|---|---|---|
| Control | 162 | 25 | 17 | 30 |
| Brain | 151 | 26 | 20 | 31 |
| Design | 140 | 27 | 29 | 28 |
| Clubs | 147 | 12 | 25 | 36 |
| Total | 600 | 90 | 91 | 125 |
| Test | df | Chi Squared | p.value |
|---|---|---|---|
| Pearson’s Chi-squared test | 9 | 12.8 | 0.17 |
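This test can be reproduced directly from the cell counts in the PGSI table above:

```r
# PGSI risk category counts by group, copied from the table above
pgsi <- matrix(
  c(162, 25, 17, 30,    # Control
    151, 26, 20, 31,    # Brain
    140, 27, 29, 28,    # Design
    147, 12, 25, 36),   # Clubs
  nrow = 4, byrow = TRUE
)

res <- chisq.test(pgsi)
res$parameter  # df = (4 - 1) * (4 - 1) = 9
res$statistic  # ~12.8, as reported above
```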
Likewise the number of individuals who had gambled at least once in the prior 12 months was (approximately) evenly distributed across experimental groups:
| Group | No Gambling | Gambling |
|---|---|---|
| Control | 149 | 85 |
| Brain | 162 | 66 |
| Design | 143 | 81 |
| Clubs | 147 | 73 |
| Total | 601 | 305 |
| Test | df | Chi Squared | p.value |
|---|---|---|---|
| Pearson’s Chi-squared test | 3 | 3.66 | 0.3 |
There were some issues with data quality in the responses to gambling frequency. For instance, it seems a number of participants misunderstood the question’s time window: “On how many days in the last 12 months have you gambled money in each of the following ways.”
The 20 highest overall totals are shown below. Given there are only 365 days in a year, some of these responses are clearly troubling. It’s possible that some individuals interpreted this as the total number of bets (such as spins on a poker machine), rather than the total number of days on which betting occurred at least once. It’s also possible that these participants were responding at random, made keystroke errors, or were satisficing.
| Cards | Racing | SportsBetting | Dice | SkillGame | Pokies | Bingo | CasinoGames | LotteryScratch | None |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 5000 | 2500 | 0 | 0 | 1500 | 0 | 1000 | 200 | FALSE |
| 0 | 1000 | 0 | 0 | 0 | 2000 | 0 | 6000 | 500 | FALSE |
| 0 | 0 | 0 | 1000 | 0 | 1500 | 0 | 0 | 0 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 1500 | 0 | 0 | 500 | FALSE |
| 0 | 0 | 2000 | 0 | 0 | 0 | 0 | 0 | 0 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 1549 | 0 | 0 | 0 | FALSE |
| 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1220 | FALSE |
| 0 | 50 | 0 | 0 | 0 | 1000 | 20 | 0 | 10 | FALSE |
| 0 | 300 | 350 | 0 | 0 | 25 | 0 | 55 | 200 | FALSE |
| 200 | 0 | 200 | 0 | 0 | 0 | 200 | 0 | 200 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 800 | FALSE |
| 0 | 600 | 0 | 0 | 0 | 0 | 0 | 0 | 120 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 0 | 50 | 0 | 500 | FALSE |
| 0 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 500 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 500 | 0 | 0 | 0 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 500 | 0 | 0 | 0 | FALSE |
| 0 | 350 | 0 | 0 | 0 | 125 | 0 | 20 | 0 | FALSE |
| 0 | 150 | 0 | 0 | 0 | 20 | 0 | 0 | 300 | FALSE |
| 0 | 0 | 0 | 0 | 0 | 200 | 100 | 0 | 100 | FALSE |
| 0 | 200 | 200 | 0 | 0 | 0 | 0 | 0 | 0 | FALSE |
At this stage I’m not sure we plan to use this information, but it’s good to be aware of.
We also asked participants: “What is the largest amount of money you have ever gambled with in any one day?” Unsurprisingly, responses to this question were highly skewed. The highest reported lifetime daily expenditure was $1 million. I’ve drawn two plots below, capped at $2000 (200 bins) and $100 (101 bins), to display the spread:
There’s also a fairly classic preference for multiples of 5. Given that distribution, and how sensitive the means are to extreme values, it’s probably not a good idea to run parametric tests on these data. Again, I have no direct plans to use this information at this stage.
| Group | median | mean | sd |
|---|---|---|---|
| Control | 24 | 256.32 | 1036.67 |
| Brain | 20 | 707.32 | 7515.91 |
| Design | 20 | 4788.79 | 66814.10 |
| Clubs | 20 | 287.34 | 1100.82 |
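The gap between the medians (around $20) and the means above is what heavy right tails do: a single extreme response can dominate the mean while leaving the median untouched. A toy illustration (the values below are made up, not real responses):

```r
spend <- c(5, 10, 20, 20, 50, 100)  # toy daily-spend responses (dollars)
c(median = median(spend), mean = round(mean(spend), 2))

spend2 <- c(spend, 1e6)             # add a single $1M response
c(median = median(spend2), mean = round(mean(spend2), 2))
```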
Participants appeared to spend about the same amount of time reading in each condition. There is, however, a lot of variation in reading time overall: even if it doesn’t vary by group, it certainly varies by participant.
Note that we set a timer on the page so participants could not press next until at least 45 seconds had passed. We also had 19 missing values on this measure.
| Group | Min | 5% | 25% | Median | 75% | 95% | Max | Mean | SD |
|---|---|---|---|---|---|---|---|---|---|
| Brain | 63 | 73.1 | 103.0 | 149 | 235.0 | 551.2 | 4415 | 227.8 | 343.4 |
| Design | 62 | 70.8 | 106.0 | 164 | 245.8 | 411.2 | 1831 | 202.5 | 168.5 |
| Clubs | 59 | 71.0 | 106.2 | 153 | 223.8 | 524.8 | 4016 | 239.8 | 387.6 |
Across each of the above quantiles, reading times were reasonably similar, although the median reading time in the Design group was slightly higher than in the other two groups. Notably, there were more extreme observations (> 1000 seconds) in the tail for the Brain and Clubs conditions:
d |>
group_by(Group) |>
filter(Group != "Control" & !is.na(SpeedInterv)) |>
count(SpeedInterv > 600, Group) |>
kable()
| Group | SpeedInterv > 600 | n |
|---|---|---|
| Brain | FALSE | 215 |
| Brain | TRUE | 8 |
| Design | FALSE | 210 |
| Design | TRUE | 6 |
| Clubs | FALSE | 204 |
| Clubs | TRUE | 10 |
There is also some clear positive skew in each of these distributions. This is unsurprising: we are measuring time, and measures of time commonly display this kind of positive skew. The measure is also bounded below at 45 seconds (we set a minimum reading time on the article page for all conditions). A standard regression isn’t going to cut it here, so I’m going to want to transform these data.
I might also need to winsorise some of the more extreme values. I could probably make a case for listwise deletion of these cases; 4000 seconds is over an hour, and it seems unlikely that someone was that interested in the content of the article. More likely they got up to do something else. But they may nonetheless have paid attention, and I don’t like deleting data without a clear indication that something was clearly wrong with the cases.
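If we do winsorise, a minimal helper might look like the following (a sketch only; the 95th-percentile cap is an arbitrary choice for illustration, not a decided rule):

```r
# Cap values above a chosen upper quantile (sketch, not the final rule)
winsorise <- function(x, upper = 0.95) {
  cap <- quantile(x, probs = upper, na.rm = TRUE)
  pmin(x, cap)
}

rt <- c(63, 103, 149, 235, 551, 4415)  # toy reading times (seconds)
winsorise(rt)  # the 4415 is pulled down to the 95th percentile
```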
Taking a quick look, the failure rate on our comprehension check was about the same among these more extreme cases as it was in the data set overall.
# 10 minutes
d |>
group_by(Group, TotalCheck) |>
filter(Group != "Control" & !is.na(SpeedInterv)) |>
count(SpeedInterv > 600) |>
filter(`SpeedInterv > 600` == TRUE) |>
group_by(Group) |>
mutate(prop = n/sum(n)) |>
kable()
| Group | TotalCheck | SpeedInterv > 600 | n | prop |
|---|---|---|---|---|
| Brain | Correct | TRUE | 7 | 0.8750000 |
| Brain | Failed | TRUE | 1 | 0.1250000 |
| Design | Correct | TRUE | 5 | 0.8333333 |
| Design | Failed | TRUE | 1 | 0.1666667 |
| Clubs | Correct | TRUE | 9 | 0.9000000 |
| Clubs | Failed | TRUE | 1 | 0.1000000 |
# 7 minutes
d |>
group_by(Group, TotalCheck) |>
filter(Group != "Control" & !is.na(SpeedInterv)) |>
count(SpeedInterv > 420) |>
filter(`SpeedInterv > 420` == TRUE) |>
group_by(Group) |>
mutate(prop = n/sum(n)) |>
kable()
| Group | TotalCheck | SpeedInterv > 420 | n | prop |
|---|---|---|---|---|
| Brain | Correct | TRUE | 17 | 0.8500000 |
| Brain | Failed | TRUE | 3 | 0.1500000 |
| Design | Correct | TRUE | 8 | 0.8888889 |
| Design | Failed | TRUE | 1 | 0.1111111 |
| Clubs | Correct | TRUE | 14 | 0.8750000 |
| Clubs | Failed | TRUE | 2 | 0.1250000 |
I’ll hold onto these cases rather than dropping them from the primary analysis.
As I mentioned, I’ll transform these data prior to analysis to account for the positive skew. A log-normal is likely a much better assumption than a normal. I will also standardise the data following the log transform, which will make life easier when it comes to setting priors. Finally, given that we know the data must be larger than 45 seconds, I’m going to shift them back by 45 seconds, so that the measure represents time from the point the minimum-time limit expired.
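Concretely, the transform chain is (sketched on a hypothetical raw-seconds vector `rt`):

```r
rt <- c(63, 103, 149, 235, 551, 245)   # hypothetical reading times (seconds)

rt_shift <- rt - 45                    # time since the 45-second timer expired
rt_lg    <- log(rt_shift)              # log-normal assumption: model on log scale
rt_z     <- as.numeric(scale(rt_lg))   # standardise to mean 0, sd 1

round(c(mean = mean(rt_z), sd = sd(rt_z)), 2)  # 0 and 1 by construction
```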
# I've switched off warnings and results output for this stan code.
m.RTZ <- ulam(
alist(
RT_z ~ dnorm(mu, sigma),
mu <- alpha[G],
alpha[G] ~ dnorm(0, 1), # Data have been standardised, which makes setting our prior fairly easy
# Potential Unequal Variance (Equal variance not assumed).
sigma <- s[G],
s[G] ~ dexp(1)
),
data = dat_list,
cores = 8, chains = 8, iter = 2750, warmup = 1000, log_lik = TRUE
)
# Posterior draws
post <- tidy_draws(m.RTZ) |> select(.draw, alpha.1:s.3)
# Long form
post <-
post |>
pivot_longer(alpha.1:s.3, names_to = ".variable", values_to = ".value") |>
# Some regex to recover group and variable types
mutate(
Group = str_extract(string = .variable, pattern = "(?<=\\.)[:digit:]"),
.variable = str_extract(string = .variable, pattern = "[:alpha:]*(?=\\.)"))
post |>
group_by(Group, .variable) |>
median_hdi(.value) |>
arrange(.variable) |>
mutate(Group = factor(Group, labels = c("Brain", "Design", "Clubs")),
.variable = factor(.variable, levels = c("alpha", "s"), labels = c("Mean", "SD"))) |>
rename(param = .variable) |>
kable(digits = 2, caption = "Analysis of Mean Reading Time by Group")
| Group | param | .value | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|---|
| Brain | Mean | 0.00 | -0.13 | 0.13 | 0.95 | median | hdi |
| Design | Mean | 0.00 | -0.13 | 0.13 | 0.95 | median | hdi |
| Clubs | Mean | 0.00 | -0.14 | 0.14 | 0.95 | median | hdi |
| Brain | SD | 1.02 | 0.93 | 1.12 | 0.95 | median | hdi |
| Design | SD | 0.95 | 0.86 | 1.04 | 0.95 | median | hdi |
| Clubs | SD | 1.04 | 0.95 | 1.14 | 0.95 | median | hdi |
Clearly there is no difference on the standardised scale in mean reading time by group. There is a little less variation in the Design condition, though the posterior intervals on the SD differences still include zero:
post |>
filter(.variable == "s") |>
mutate(Group = factor(Group, levels = 1:3, labels = c("Brain", "Design", "Clubs"))) |>
pivot_wider(names_from = "Group", values_from = ".value") |>
mutate(across(c("Brain", "Clubs"), .fns = ~(.x - Design))) |>
ungroup() |>
select(.draw, Brain, Clubs) |>
pivot_longer(cols = Brain:Clubs) |>
group_by(name) |>
median_hdi() |>
kable(digits = 2, caption = "Difference in SD vs. Design Group")
| name | value | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| Brain | 0.07 | -0.06 | 0.20 | 0.95 | median | hdi |
| Clubs | 0.09 | -0.04 | 0.22 | 0.95 | median | hdi |
A simple check of each of these sensitivity analyses is the posterior predictive check, which compares predictions from each model to the observed data. I’ve done this below:
# I'll write a function to get this done
get_ppc <- function(model, thisMany = 100) {
d2 |> count(Group) -> ppc
posterior <- extract.samples(model)
ppc$G <- 1:3
ppc <-
ppc |>
mutate(
sample = list(1:thisMany),
alpha = purrr::map(
.x = G,
.f = function(.x) {
posterior$alpha[1:thisMany, .x]
}
),
s = purrr::map(
.x = G,
.f = function(.x) {
posterior$s[1:thisMany, .x]
})) |>
unnest(everything()) |>
mutate(
RT = purrr::pmap(
.l = list(n = n, mean = alpha, sd = s),
.f = rnorm
)
)
}
# Posterior predictive check:
ppc_Z <- get_ppc(m.RTZ)
back_to_sec <- function(x) {
x <- ((x * sd(dat_list$RT_lg)) + mean(dat_list$RT_lg))
x <- exp(x)
return(x)
}
ppc_Z <-
ppc_Z |>
select(Group, n, G, sample, RT) |>
unnest(sample) |>
unnest(RT) |>
mutate(RT = back_to_sec(RT))
# Log transform ---------
ggplot() +
theme_bw() +
coord_cartesian(xlim = c(0, 1200)) +
facet_wrap(~Group, nrow = 3) +
geom_density(data = ppc_Z,
mapping = aes(x = RT, group = sample),
colour=alpha("hotpink1", 0.2)) +
geom_density(data = d2,
mapping = aes(x = RT, group = Group),
colour = "black") +
labs(subtitle = "Log-Transformed and Standardised",
caption = "Density plot of observed data displayed in black
\nDensity plots over predicted data in hot pink ;)",
y = "Density")
Some nice looking fits.
To wrap up, I’ll summarise a posterior predictive check using the full set of posterior draws from the log-transform model, back on the measurement scale:
ppc_Z <- get_ppc(model = m.RTZ, thisMany = 1e4)
ppc_Z <- ppc_Z |> unnest(RT)
ppc_Z <-
ppc_Z |>
mutate(RT = back_to_sec(RT)) |>
group_by(Group, sample) |>
summarise(
".05" = quantile(RT, probs = c(.05)),
".25" = quantile(RT, probs = c(.25)),
".5" = quantile(RT, probs = c(.5)),
".75" = quantile(RT, probs = c(.75)),
".95" = quantile(RT, probs = c(.95)))
ppc_Z <-
ppc_Z |>
ungroup() |>
group_by(Group) |>
summarise(across(!sample, .fns = list)) |>
mutate(across(!Group, .fns = ~purrr::map(.x = .x, median_hdi)))
ppc_Z <-
ppc_Z |>
mutate(across(!Group, .fns = ~purrr::map_chr(
.x = .x,
.f = function(x) {
x |>
select(y:ymax) |>
mutate(across(everything(), .fns = ~round(.x, 2))) |>
summarise(Est = paste0(y, " [", ymin, ", ", ymax, "]")) |>
pull(Est)
})))
ppc_Z |>
pivot_longer(cols = !Group, names_to = "Quantile") |>
pivot_wider(names_from = Group, values_from = value) |>
kable(digits = 2, caption = "Quantile summary of posterior distribution")
| Quantile | Brain | Design | Clubs |
|---|---|---|---|
| .05 | 25.27 [17.89, 33.2] | 28.07 [20.39, 36.75] | 24.56 [17, 32.52] |
| .25 | 59.93 [47.77, 72.65] | 62.87 [50.77, 75.31] | 59.27 [46.86, 72.59] |
| .5 | 110.15 [90.5, 132.79] | 110.83 [91.99, 131.72] | 110.55 [90.51, 134.13] |
| .75 | 203 [162.86, 246.58] | 195.28 [157.45, 234.49] | 205.35 [164.37, 253.91] |
| .95 | 481.71 [351.99, 646.58] | 436.74 [325.66, 579.8] | 496.23 [350.31, 667.02] |
Overall I don’t see any reason to be concerned with these observations; any differences in reading times were small or negligible. There were fewer extreme observations in the Design group, and winsorising extreme scores may address this. But it is not a central concern, so long as we have good evidence that each group spent enough time reading the article.
I’ve saved the most important check for last. Following the intervention article, participants were asked to respond to a simple comprehension check, which asked them to identify which of four sentences best described the article they had viewed. Each group was presented with the same three false answers, in addition to a unique answer that correctly described the article they had viewed. 69.35% of participants responded correctly to the first check, 10.71% asked to view the article again, and 19.94% answered incorrectly. Those who failed or requested to view the article again were shown the article page a second time, followed by another manipulation check question that differed from the first. Overall, 87.8% of respondents answered the first or second manipulation check correctly, representing a high level of comprehension and attention.
d |>
filter(Group != "Control") |>
mutate(Group = factor(Group, levels = c("Brain", "Design", "Clubs"))) |>
select(Group, TotalCheck) |>
mutate(TotalCheck = TotalCheck == "Correct") -> dat_list
m.Attention <- ulam(
alist(
TotalCheck ~ bernoulli(p),
logit(p) <- a[Group],
a[Group] ~ dnorm(0, 1.5)
), data = dat_list, cores = 8, chains = 8, iter = 2750, warmup = 1000)
post <- tidy_draws(m.Attention) |> select(.draw, a.1:a.3)
# Compute probs
post <- post |> mutate(across(a.1:a.3, .fns = plogis))
post <-
post |>
pivot_longer(cols = a.1:a.3, names_to = "Group", values_to = "p") |>
mutate(Group = str_extract(Group, pattern = "(?<=\\.)[:digit:]"),
# Groups were coded Brain = 1, Design = 2, Clubs = 3 above
Group = factor(Group, labels = c("Brain", "Design", "Clubs")))
post |>
group_by(Group) |>
median_hdi(p) |>
kable(digits = 2)
| Group | p | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| Brain | 0.87 | 0.82 | 0.91 | 0.95 | median | hdi |
| Design | 0.88 | 0.83 | 0.92 | 0.95 | median | hdi |
| Clubs | 0.88 | 0.84 | 0.92 | 0.95 | median | hdi |
The success rate on this measure was more or less exactly the same across experimental groups.
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (2nd ed.). Taylor and Francis, CRC Press.
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432.
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Monterey 12.1
##
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] digest_0.6.29 tidybayes.rethinking_3.0.0
## [3] tidybayes_3.0.2 posterior_1.2.0
## [5] rethinking_2.21 cmdstanr_0.4.0.9001
## [7] rstan_2.21.3 StanHeaders_2.21.0-7
## [9] janitor_2.1.0 broom_0.7.12
## [11] knitr_1.37 ggridges_0.5.3
## [13] forcats_0.5.1 stringr_1.4.0
## [15] dplyr_1.0.8 purrr_0.3.4
## [17] readr_2.1.2 tidyr_1.2.0
## [19] tibble_3.1.6 ggplot2_3.3.5
## [21] tidyverse_1.3.1 datapasta_3.1.0
## [23] here_1.0.1
##
## loaded via a namespace (and not attached):
## [1] matrixStats_0.61.0 fs_1.5.2 lubridate_1.8.0
## [4] httr_1.4.2 rprojroot_2.0.2 tensorA_0.36.2
## [7] tools_4.1.2 backports_1.4.1 bslib_0.3.1
## [10] utf8_1.2.2 R6_2.5.1 ggdist_3.1.0
## [13] DBI_1.1.2 colorspace_2.0-3 withr_2.4.3
## [16] tidyselect_1.1.2 gridExtra_2.3 prettyunits_1.1.1
## [19] processx_3.5.2 compiler_4.1.2 cli_3.2.0
## [22] rvest_1.0.2 HDInterval_0.2.2 arrayhelpers_1.1-0
## [25] xml2_1.3.3 labeling_0.4.2 sass_0.4.0
## [28] scales_1.1.1 checkmate_2.0.0 mvtnorm_1.1-3
## [31] callr_3.7.0 rmarkdown_2.11 pkgconfig_2.0.3
## [34] htmltools_0.5.2 highr_0.9 dbplyr_2.1.1
## [37] fastmap_1.1.0 rlang_1.0.1 readxl_1.3.1
## [40] rstudioapi_0.13 svUnit_1.0.6 shape_1.4.6
## [43] jquerylib_0.1.4 generics_0.1.2 farver_2.1.0
## [46] jsonlite_1.7.3 distributional_0.3.0 inline_0.3.19
## [49] magrittr_2.0.2 loo_2.4.1 Rcpp_1.0.8
## [52] munsell_0.5.0 fansi_1.0.2 abind_1.4-5
## [55] lifecycle_1.0.1 stringi_1.7.6 yaml_2.3.5
## [58] snakecase_0.11.0 MASS_7.3-55 pkgbuild_1.3.1
## [61] plyr_1.8.6 grid_4.1.2 crayon_1.5.0
## [64] lattice_0.20-45 haven_2.4.3 hms_1.1.1
## [67] ps_1.6.0 pillar_1.7.0 codetools_0.2-18
## [70] stats4_4.1.2 reprex_2.0.1 glue_1.6.1
## [73] evaluate_0.15 data.table_1.14.3 RcppParallel_5.1.5
## [76] modelr_0.1.8 vctrs_0.3.8 tzdb_0.2.0
## [79] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
## [82] xfun_0.29 coda_0.19-4 ellipsis_0.3.2